Ensembles of Proximity-Based One-Class Classifiers for Author Verification: Notebook for PAN at CLEF 2014
Abstract
We use ensembles of proximity-based one-class classifiers for the authorship verification task. For each document of known authorship, the one-class classifiers compare two dissimilarities: the dissimilarity between that document and the most dissimilar other document by the same author, and the dissimilarity between that document and the questioned document. As the dissimilarity measure between documents, we use the Common N-Gram (CNG) dissimilarity based on character or word n-grams.
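The comparison described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The CNG dissimilarity is written in its commonly cited form (squared relative frequency differences over the union of two n-gram profiles). Several details are assumptions made for the example and are not stated in the abstract: expressing the comparison as a ratio, averaging the ratios over the known documents, combining ensemble members by a plain mean, all function names, and the default values of `n` and `profile_len`. Only character n-grams are shown.

```python
from collections import Counter


def ngram_profile(text, n=3, profile_len=500):
    """Map the profile_len most frequent character n-grams to normalized frequencies."""
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    total = len(grams)
    return {g: c / total for g, c in Counter(grams).most_common(profile_len)}


def cng_dissimilarity(p1, p2):
    """Common N-Gram dissimilarity between two profiles (dicts of n-gram -> frequency)."""
    total = 0.0
    for g in set(p1) | set(p2):
        f1 = p1.get(g, 0.0)
        f2 = p2.get(g, 0.0)
        # Squared difference, normalized by the mean of the two frequencies.
        total += ((f1 - f2) / ((f1 + f2) / 2.0)) ** 2
    return total


def proximity_score(known_docs, questioned_doc, n=3, profile_len=500):
    """Average ratio D(d_i, u) / max_j D(d_i, d_j) over the known documents d_i.

    Assumption for this sketch: lower scores suggest the questioned document u
    lies within the spread of the known documents.
    """
    if len(known_docs) < 2:
        raise ValueError("at least two known documents are required")
    known = [ngram_profile(d, n, profile_len) for d in known_docs]
    unknown = ngram_profile(questioned_doc, n, profile_len)
    ratios = []
    for i, p in enumerate(known):
        # Dissimilarity to the most dissimilar other document by the same author.
        d_max = max(cng_dissimilarity(p, q) for j, q in enumerate(known) if j != i)
        if d_max == 0.0:
            raise ValueError("known documents have identical profiles")
        ratios.append(cng_dissimilarity(p, unknown) / d_max)
    return sum(ratios) / len(ratios)


def ensemble_score(known_docs, questioned_doc, configs=((3, 500), (4, 500))):
    """Hypothetical ensemble: mean proximity score over several (n, profile_len) settings."""
    scores = [proximity_score(known_docs, questioned_doc, n, length)
              for n, length in configs]
    return sum(scores) / len(scores)
```

A decision would then follow from comparing the score with a threshold tuned on training problems; the threshold value and the tuning procedure are outside what the abstract specifies.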
Similar articles
Proximity Based One-class Classification with Common N-Gram Dissimilarity for Authorship Verification Task: Notebook for PAN at CLEF 2013
We describe our participation in the Author Identification task of the PAN 2013 competition. This competition task presents participants with a set of authorship verification problems. In each such problem, one is given a set of documents written by one author and a sample document; the task is to answer the question whether or not the sample document was written by the same author as the rem...
Style-based Distance Features for Author Verification: Notebook for PAN at CLEF 2013
In this paper we present the approach we took in our participation in the PAN 2013 Author Profiling task. It is an adaptation of our system submitted for author identification, assuming that a profile category (authors belonging to the same gender and age group categories) can be analyzed in the same way as an author’s style.
Authorship Verification: An Approach based on Random Forest: Notebook for PAN at CLEF 2015
Authorship attribution, an important problem in many areas including information retrieval, computational linguistics, law, and journalism, has become a subject of increasing research interest in recent years. In the Author Identification task in PAN at CLEF 2015, the main focus was on cross-genre and cross-topic author verification tasks. We have used seve...
Author Verification: Exploring a Large set of Parameters using a Genetic Algorithm - Notebook for PAN at CLEF 2014
In this paper we present the system we submitted to the PAN’14 competition for the author verification task. We consider the task as a supervised classification problem, where each case in a dataset is an instance. Our system works by applying the same combination of parameters to every case in a dataset. Thus, the training stage consists in finding an optimal combination of parameters which ma...
Using Simple Content Features for the Author Profiling Task: Notebook for PAN at CLEF 2013
This paper describes the methods we have employed to solve the author profiling task at PAN-2013. Our goal was to use simple features to identify the age group and the gender of the author of a given text. We introduce the features, detail how the classifiers were trained, and how the experiments were run.